Entry Name:  “VT-WANG-MC2”

VAST Challenge 2015
Mini-Challenge 2

 

 

Team Members:

Ji Wang, Virginia Tech, wji@vt.edu       PRIMARY

Junpeng Wang, Virginia Tech, junpeng@vt.edu       

Chris North, Virginia Tech, north@vt.edu   

 

Student Team:   YES

 

Did you use data from both mini-challenges?  YES

 

Analytic Tools Used:

MoveView, developed by the team for the challenge.

Spectrum, developed by the team for the challenge.

Tableau

Gephi

 

Approximately how many hours were spent working on this submission in total?

150 Hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete?     Yes

 

 

Video Download

Video:

http://people.cs.vt.edu/~wji/VT-Wang-MC2.wmv

 

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 Identify those IDs that stand out for their large volumes of communication. For each of these IDs:

 

      a.         Characterize the communication patterns you see.

      b.        Based on these patterns, what do you hypothesize about these IDs?

 

Limit your response to no more than 4 images and 300 words.

 

In the communication data, we found three IDs with significantly large volumes of communication. We summarize their communication patterns and our hypotheses as follows:

 

1.     ID: 1278894:

a.      Communication Patterns:

                                      i.     On Friday, Saturday and Sunday, ID 1278894 broadcast messages to many visitors in the park every 5 minutes during the following time slots: 12:00-12:55, 14:00-14:55, 16:00-16:55, 18:00-18:55, and 20:00-20:55. During these five time slots, ID 1278894 also received replies from visitors in all five areas. Most of the replies came from the Wet Land area.

                                     ii.     Figure 1 shows the frequency and number of messages that ID 1278894 received from other IDs in the park on Saturday. Similar patterns appear on Friday and Sunday. The five time slots in which it regularly received messages are clearly visible.

                                   iii.     Figure 1 also illustrates that ID 1278894 broadcast messages regularly on Friday, Saturday and Sunday with the same communication pattern, i.e. broadcasting a message every 5 minutes in these five time slots.

b.     We believe that ID 1278894 is a Park Service ID, as it broadcast messages regularly to interact with visitors in the park across three days.

[Figure 1]
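The regular 5-minute broadcast pattern described above can be checked by binning messages per minute and looking at the gaps between active minutes. The following is a minimal sketch, not the code behind MoveView or Spectrum. The (timestamp, from, to, location) record layout, the function names, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def sends_per_minute(records, sender):
    """Count the messages sent by `sender` in each minute.

    `records` is an iterable of (timestamp, from_id, to_id, location)
    tuples. This layout is an assumption, not the exact file format.
    """
    counts = Counter()
    for ts, from_id, _to_id, _location in records:
        if from_id == sender:
            counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def broadcast_gaps(counts):
    """Return the gaps, in minutes, between consecutive active minutes."""
    minutes = sorted(counts)
    return [int((b - a).total_seconds() // 60)
            for a, b in zip(minutes, minutes[1:])]


# Made-up sample: one sender messaging two receivers every 5 minutes.
sample = [
    (datetime(2014, 6, 7, 12, minute, second), "1278894", receiver, "Entry Corridor")
    for minute in (0, 5, 10)
    for second, receiver in ((1, "A"), (2, "B"))
]
counts = sends_per_minute(sample, "1278894")
gaps = broadcast_gaps(counts)
print(gaps)
```

A sender whose gaps are all equal within a time slot is a candidate for an automated or scheduled broadcaster.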

 

2.     ID: 839736:

a.      Communication Patterns:

                                      i.     This ID regularly broadcast messages to most visitors in the park from 8:00 to 23:30 every day. It sent out around 30 messages per minute on average. At the same time, this ID also received messages from most visitors in the park across the three days.

b.     We found several unusual communication patterns for this ID.

                                      i.     This ID received a lot of messages from visitors in the Wet Land area. Around 12:00 on Sunday, it received 1,573 messages from 316 visitors. The spike then tapered off and lasted until around 12:30 (Figure 2).

                                     ii.     At 14:42 on Sunday, this ID received more than 300 messages per minute from more than 100 visitors (the peak is 116 in Figure 2). The spike appeared between 14:39 and 14:52. Most of the messages came from the Coaster Alley area.

                                   iii.     Between 12:30 and 15:30, this ID received more messages from Wet Land than the weekend average, from more than 20 different visitors every minute (Figure 2).

                                   iv.     Around 12:03 on Sunday, this ID broadcast 1,417 messages per minute. This spike lasted until around 12:30 (Figure 3).

                                     v.     The second spike appeared around 14:42 on Sunday (i.e. 2:42pm in Figure 3), when it broadcast 326 messages per minute.

                                   vi.     Between 12:30 and 15:30 (i.e. between 12:30pm and 3:30pm in Figure 3), this ID sent out more messages than the weekend average (more than 50 messages).

c.      We believe that this ID is a Park Service ID. It broadcast announcements, alerts and updates to most visitors in the park every day.

 

 

[Figure 2]

[Figure 3]
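Spikes like the ones reported for ID 839736 can be flagged by comparing each minute's message count against a baseline. The sketch below uses the median count as the baseline and a fixed multiplier as the threshold. Both choices, the function name, and the sample counts are illustrative assumptions, not the method used to produce Figures 2 and 3.

```python
from statistics import median


def find_spikes(per_minute, factor=5.0):
    """Return the minutes whose count exceeds `factor` times the median.

    `per_minute` maps a minute label to a message count. The median
    baseline and the default factor are illustrative choices.
    """
    baseline = median(per_minute.values())
    return sorted(m for m, n in per_minute.items() if n > factor * baseline)


# Made-up counts: a steady level of about 30 messages per minute
# with two bursts.
received = {
    "11:58": 28, "11:59": 31, "12:00": 1573, "12:01": 900,
    "14:41": 30, "14:42": 326, "14:43": 33,
}
spikes = find_spikes(received)
print(spikes)
```

The median is used instead of the mean so that the bursts themselves do not inflate the baseline.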

 

 

3.     External:

a.      Communication Patterns:

                                      i.     Normally, around 15-20 visitors in the park sent messages to “External” per minute every day (Figure 4). The number of messages received by “External” was around 30 per minute.

                                     ii.     From 11:45 to 12:00 on Sunday, more than 160 visitors per minute (the peak value was 231) sent messages to “External”. This spike peaked around 11:59 and decreased significantly after 12:00.

b.      We believe that visitors sent messages out of the park to share their experiences in the park. We also think that an incident happened in Wet Land around 11:45 on Sunday, and that visitors in Wet Land sent a large number of messages to “External” to post or share it (Figure 4).

[Figure 4]

 

 

 

MC2.2 Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.

 

Limit your response to no more than 10 images and 1000 words.

 

1.     Park service communication (ID 1278894): Please see our answer to MC2.1 for ID 1278894.

 

2.     Park service communication (ID 839736): Please see our answer to MC2.1 for ID 839736.

 

3.     Communication pattern in large groups whose members always moved together:

Members of this type of large group always moved together and communicated within the group. We used a community detection algorithm (the Louvain method) to detect the groups in the communication data.

                         i.         The size of these groups was around 40 people.

                       ii.         These groups usually arrived around 9:30 and were interested in Thrill Rides, Shows and Beer.

                      iii.         We found around 4 such groups on Friday; around 12 groups on Saturday; around 15 groups on Sunday.

                      iv.         We believe the members of each group may belong to the same organization, or their visit was arranged by some organization. The following figure shows these groups on Sunday in Spectrum.

[Figure: large groups on Sunday, shown in Spectrum]
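The Louvain method mentioned above searches for a partition of the communication graph that maximizes modularity. The sketch below does not implement Louvain itself. It only computes the modularity score of a given partition, which is the quantity Louvain optimizes. It assumes an undirected graph whose edge weights are the number of messages exchanged between two IDs. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    `edges` is a list of (u, v, weight) with u != v, and `community`
    maps each node to a community label.
    Q = sum over communities c of  L_c / m - (d_c / (2 m)) ** 2,
    where m is the total edge weight, L_c the weight inside c,
    and d_c the summed weighted degree of the nodes in c.
    """
    degree = defaultdict(float)
    inside = defaultdict(float)
    total = 0.0
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        total += w
        if community[u] == community[v]:
            inside[community[u]] += w
    comm_degree = defaultdict(float)
    for node, k in degree.items():
        comm_degree[community[node]] += k
    return sum(inside[c] / total - (comm_degree[c] / (2 * total)) ** 2
               for c in comm_degree)


# Toy graph: two triangles of visitors joined by a single message link.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
         ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
         ("c", "d", 1)]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

Splitting the toy graph into its two triangles scores higher than putting everyone in one community, which is why a modularity-based method would report them as two groups.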

4.     Communication patterns of groups whose members moved separately but communicated closely within the group. From the movement data, we found that group members split into 2-3 subgroups, and different subgroups took different paths. Although the subgroups took different paths, they communicated intensively via their devices.

 

[Figure]

 

5.     Communication patterns in small groups. These groups usually had around 2 people. Group members followed the same path in the park and communicated a lot with each other. We believe these groups were couples or small families. The following figure shows the result from the visualization tools Spectrum and Gephi.

[Figure: small groups, shown in Spectrum and Gephi]

 

6.     By integrating the movement data, we found several groups that always moved together in the park but never communicated within the group. One typical group of this kind contains the following IDs: 521750, 644885, 1080969, 1600469, 1629516, 1781070, 1787551 and 1935406. The following figure from Spectrum shows the group’s movement across three days. This group was only interested in location 63.

[Figure: the group’s movement across three days, shown in Spectrum]
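Groups that move together but never communicate can be found by joining the movement data with the communication data. The sketch below is one simple way to do this, not the procedure used in Spectrum. It assumes movement records of the form (timestamp, id, x, y) and message records of the form (from, to), and it treats two IDs as co-located only when their timestamp and position match exactly, which is a simplification. The function name, the threshold, and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def silent_companions(movements, messages, min_shared=0.9):
    """Find ID pairs that were mostly co-located but never messaged each other.

    movements: iterable of (timestamp, visitor_id, x, y)
    messages:  iterable of (from_id, to_id)
    """
    positions = defaultdict(set)
    for ts, visitor_id, x, y in movements:
        positions[visitor_id].add((ts, x, y))
    talked = {frozenset(pair) for pair in messages}
    result = []
    # Pairwise comparison is quadratic in the number of visitors;
    # acceptable for a sketch, too slow for the full data set.
    for a, b in combinations(sorted(positions), 2):
        shared = len(positions[a] & positions[b])
        smaller = min(len(positions[a]), len(positions[b]))
        if shared / smaller >= min_shared and frozenset((a, b)) not in talked:
            result.append((a, b))
    return result


# Made-up sample: two IDs walk together silently, while P and Q
# walk together and exchange a message.
movements = [(t, vid, t, 0) for t in range(5) for vid in ("521750", "644885")]
movements += [(t, vid, 50, 50 + t) for t in range(5) for vid in ("P", "Q")]
messages = [("P", "Q")]
pairs = silent_companions(movements, messages)
print(pairs)
```

Pairs returned by such a check could then be merged into larger groups, for example by taking connected components of the resulting pair graph.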

 

 

 

 

MC2.3 From this data, can you hypothesize when the crime was discovered? Describe your rationale.

 

Limit your response to no more than 3 images and 300 words. 

1.     Based on the unusual patterns in the communication data, we believe that the crime was discovered at location 32 (Pavilion) on Sunday between 11:45 and 12:00.

 

2.     On Sunday, around 12:00-12:05, messages sent to ID 839736 (which we believe is the Park Service ID) peaked. Most of these messages were sent from the Wet Land area (location 32). Please see Figure 2 in our answers to MC2.1. The following figure from MoveView shows that many messages were sent from location 32 to Park Service ID 839736. In the figure, messages travel from the red end to the green end.

[Figure: MoveView view of messages sent from location 32 to Park Service ID 839736]

 

3.     On Sunday, between 12:00 and 12:30 and again between 2:35pm and 3:05pm, ID 839736 (in Entry Corridor, most probably location 62) replied to many messages. Please see Figure 3 in our answers to MC2.1. The following figure from MoveView shows that many messages were sent from Park Service ID 839736 to location 32.

[Figure: MoveView view of messages sent from Park Service ID 839736 to location 32]

 

4.     On Sunday, around 12:00, messages sent to “External” peaked. Most of the messages were sent from the Wet Land area. Please see Figure 4 in our answers to MC2.1. The following figure from MoveView shows that many messages were sent from location 32 to “External”.

[Figure: MoveView view of messages sent from location 32 to “External”]

 

5.     We believe that the 6th group in our answers to MC2.2 is the group associated with the soccer star.  According to their movement pattern, they did not appear in the park on Sunday afternoon for the second Stage Show. This was probably due to the incident.